Setup

Transcript abundances in this analysis were estimated by pseudo-aligning each library (including technical replicates) to the zebrafish transcriptome from Ensembl Release 94 (GRCz11) using kallisto (v0.43.1). In addition to the standard transcriptome, the two mutant psen2 transcripts were manually added to the reference.

The corresponding set of gene descriptions was then loaded into R as an EnsDb object using the AnnotationHub() infrastructure. Likewise, the set of transcript descriptions was loaded, with the manual addition of the two novel psen2 mutant transcripts.

Gene-level Counts

Gene-level counts were imported using tximport, mapping transcripts to genes. Some genes exist both on the primary assembly and on alternate assemblies for specific regions, and these copies were treated as separate transcripts of the same gene for read summarisation purposes. Transcript counts were therefore mapped to genes using the gene symbol (e.g. psen2) instead of the gene ID.
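The symbol-based summarisation described above can be illustrated with a minimal sketch. This is not how tximport works internally (it also handles transcript lengths and abundance scaling); it only shows the idea of summing transcript counts under a shared gene symbol. The counts, the identifiers `tx_primary` and `tx_alt`, and the symbol `geneA` are hypothetical.

```python
from collections import defaultdict


def summarise_to_genes(tx_counts, tx2gene):
    """Sum transcript-level counts into gene-level counts, keyed by gene symbol."""
    gene_counts = defaultdict(float)
    for tx_id, count in tx_counts.items():
        gene_counts[tx2gene[tx_id]] += count
    return dict(gene_counts)


# Hypothetical counts for a single library.
tx_counts = {
    "ENSDART00000006381": 120.0,  # reference psen2 transcript
    "psen2N140fs": 15.0,          # manually added mutant transcript
    "tx_primary": 40.0,           # hypothetical transcript on the primary assembly
    "tx_alt": 10.0,               # hypothetical copy on an alternate assembly
}

# Mapping by symbol, so copies with different gene IDs collapse into one gene.
tx2gene = {
    "ENSDART00000006381": "psen2",
    "psen2N140fs": "psen2",
    "tx_primary": "geneA",
    "tx_alt": "geneA",
}

gene_counts = summarise_to_genes(tx_counts, tx2gene)
```

Mapping by gene ID instead would keep `tx_primary` and `tx_alt` as two separate genes, which is the situation the symbol-based mapping avoids.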

Genes were retained for analysis if a CPM > 1 was observed in at least 5 samples. This equated to about 31 reads for a gene in each of at least 5 samples, giving a total of 18,809 of the original genes for DGE analysis.
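As a rough illustration of this filter, the sketch below converts counts to CPM and keeps a gene when enough samples exceed the threshold. The library size of 31 million is an assumption inferred from the statement that a CPM of 1 equated to about 31 reads; the report does not give library sizes, and the counts are hypothetical.

```python
def cpm(count, lib_size):
    """Counts per million for a single gene in a single library."""
    return count / lib_size * 1e6


def keep_gene(counts, lib_sizes, min_cpm=1.0, min_samples=5):
    """Keep a gene if CPM > min_cpm in at least min_samples libraries."""
    n_pass = sum(cpm(c, n) > min_cpm for c, n in zip(counts, lib_sizes))
    return n_pass >= min_samples


# Assumed library size: 31 million reads per sample (not stated in the report).
lib_sizes = [31_000_000] * 6

# Hypothetical counts for two genes across six samples.
retained = keep_gene([40, 35, 50, 33, 32, 10], lib_sizes)   # 5 samples above CPM 1
discarded = keep_gene([40, 35, 50, 33, 30, 10], lib_sizes)  # only 4 samples above CPM 1
```

In practice library sizes differ between samples, so the read count corresponding to a CPM of 1 varies from library to library.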

*Total counts from each library after assigning to genes*

Counts were also transformed using voom with sample-level quality weights, allowing analysis with normal-based algorithms. Sample weights ranged between 0.4184 and 1.392, with the most strongly down-weighted sample being a WT sample.

Transcript-level Counts

Transcript-level counts were imported using catchKallisto() from edgeR in order to utilise the voom transformation on transcript-level counts.

*Sample weights using transcript-level counts, showing near identical patterns to those observed at the gene-level.*

Genotype checks

*CPM values for each psen2 transcript across all samples.*

Transcript abundances (using CPM) were calculated for each of the three psen2 transcripts. These showed the expected patterns: heterozygous expression in FAD samples and exclusively WT expression in WT samples. However, no WT allele was detected in sample 8_FS_4. This result could not be explained, and the sample should be excluded from all analyses. The remaining FS samples showed reduced abundance of the FS transcript, as expected under nonsense-mediated decay (NMD). No increases in expression of the WT allele were evident, supporting a lack of genetic compensation.

Sample 8_FS_4 was then removed from all objects, along with 12_WT_4, which had been consistently down-weighted.
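The genotype check can be sketched as a comparison between the psen2 transcripts detected in a sample and those expected for its genotype. The CPM values and the detection threshold `min_cpm` below are hypothetical; the report made this check by inspecting the plotted CPM values rather than applying a fixed cutoff.

```python
# Transcripts expected to be detectable for each genotype.
EXPECTED = {
    "WT": {"WT"},
    "FS": {"WT", "FS"},
    "FAD": {"WT", "FAD"},
}


def genotype_consistent(genotype, tx_cpm, min_cpm=0.5):
    """True if the set of detected psen2 transcripts matches the genotype."""
    detected = {tx for tx, value in tx_cpm.items() if value > min_cpm}
    return detected == EXPECTED[genotype]


# Hypothetical CPM values; the second entry mimics a sample with no WT allele.
samples = {
    "example_FS": ("FS", {"WT": 12.0, "FS": 2.5, "FAD": 0.0}),
    "example_FS_no_WT": ("FS", {"WT": 0.0, "FS": 3.1, "FAD": 0.0}),
    "example_WT": ("WT", {"WT": 25.0, "FS": 0.0, "FAD": 0.0}),
}

flagged = [
    name
    for name, (genotype, tx_cpm) in samples.items()
    if not genotype_consistent(genotype, tx_cpm)
]
```

A sample such as `example_FS_no_WT` is flagged because a heterozygous fish should express both the WT and the mutant transcript.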

Data Inspection

The next step was to perform an MDS analysis, in which minimal separation was observed between sample groups. A simple PCA also revealed that the first few principal components captured less of the total variability than might be expected.

MDS plot showing no clear groups within the data. Point sizes indicate sample weights as calculated by voomWithQualityWeights().

First five principal components, showing that the first two only account for 32.5% of the total variance, which is below expectations.

| | PC1 | PC2 | PC3 | PC4 | PC5 |
|---|---|---|---|---|---|
| Standard deviation | 22.39 | 18.64 | 17.46 | 16.3 | 14.52 |
| Proportion of Variance | 0.1921 | 0.1332 | 0.1169 | 0.1019 | 0.08084 |
| Cumulative Proportion | 0.1921 | 0.3254 | 0.4422 | 0.5441 | 0.625 |
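The rows of the PCA summary are related in a simple way: each proportion is the squared standard deviation divided by the total variance, and the cumulative row is a running sum of the proportions. The sketch below checks this using only the rounded values reported above, so small discrepancies from rounding are expected.

```python
from itertools import accumulate

# Values copied from the PCA summary (already rounded in the report).
sdev = [22.39, 18.64, 17.46, 16.3, 14.52]
proportion = [0.1921, 0.1332, 0.1169, 0.1019, 0.08084]

# Running sum of the proportions reproduces the cumulative row.
cumulative = list(accumulate(proportion))

# Each component gives an estimate of the total variance: sdev^2 / proportion.
total_variance_estimates = [s ** 2 / p for s, p in zip(sdev, proportion)]
```

All five estimates of the total variance agree closely, which is a useful sanity check that the table was transcribed correctly.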

DGE Analysis

Design

Three comparisons were defined. The first two were the differences between each of the two mutant families and the wild-type samples, and the third was the difference between the two mutant groups.
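These three comparisons can be written as contrasts on group means. The sketch below is an illustration only: the contrast names and the group means are hypothetical, and the values are assumed to be on the log2 scale, which the report does not state explicitly.

```python
# Each contrast assigns +1 to the group of interest and -1 to its reference.
CONTRASTS = {
    "FS_vs_WT": {"FS": 1, "WT": -1},
    "FAD_vs_WT": {"FAD": 1, "WT": -1},
    "FAD_vs_FS": {"FAD": 1, "FS": -1},
}


def apply_contrast(group_means, weights):
    """Weighted sum of group means, giving the estimated logFC."""
    return sum(w * group_means[g] for g, w in weights.items())


# Hypothetical mean expression (assumed log2-CPM) for one gene.
group_means = {"WT": 4.8, "FS": 4.2, "FAD": 4.9}

log_fc = {
    name: apply_contrast(group_means, weights)
    for name, weights in CONTRASTS.items()
}
```

With this coding a negative logFC means lower expression in the first-named group, and the third contrast is the difference of the first two (FAD_vs_FS = FAD_vs_WT - FS_vs_WT).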

FS Vs WT

The first analysis compared psen2^N140fs/+^ samples to psen2^+/+^ samples. A total of 4 genes were potentially detected as differentially expressed using an FDR of 5%. In the following plots, a negative value for logFC corresponds to decreased expression in the heterozygous mutants.
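The FDR values reported for this comparison can be approximately reproduced from the top-ranked p-values. The sketch below assumes the Benjamini-Hochberg adjustment and 18,809 tests (the number of genes retained for DGE analysis); neither assumption is stated explicitly in the report. The function name and its `n_tests` argument are introduced here for illustration.

```python
def bh_adjust(p_values, n_tests=None):
    """Benjamini-Hochberg adjusted p-values for input sorted in ascending order.

    n_tests is the total number of tests. When it exceeds len(p_values), the
    input is treated as the smallest p-values of a longer list, and the
    results are upper bounds on the full-list adjusted values.
    """
    n = len(p_values) if n_tests is None else n_tests
    adjusted = []
    running_min = 1.0
    # Work from the largest p-value down, enforcing monotonicity.
    for rank in range(len(p_values), 0, -1):
        running_min = min(running_min, p_values[rank - 1] * n / rank)
        adjusted.append(running_min)
    return adjusted[::-1]


# Five smallest p-values reported for the FS vs WT comparison.
top_p = [1.073e-07, 1.15e-07, 5.475e-06, 7.04e-06, 2.09e-05]
fdr = bh_adjust(top_p, n_tests=18809)
n_de = sum(q < 0.05 for q in fdr)
```

Under these assumptions four genes fall below an FDR of 5%, matching the number given in the text. Small differences from the reported FDR values arise because the p-values are rounded.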

*MD plot for psen2^N140fs/+^ samples compared to psen2^+/+^ samples*

*Volcano plot for psen2^N140fs/+^ samples compared to psen2^+/+^ samples*

10 most highly ranked genes in the comparison between psen2^N140fs/+^ samples and psen2^+/+^ samples

| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| CABZ01035279.1 | -9.703 | 0.2326 | 1.073e-07 | 0.001082 |
| psen2 | -0.6214 | 4.58 | 1.15e-07 | 0.001082 |
| atxn1l | -0.7366 | 2.467 | 5.475e-06 | 0.0331 |
| ptcd1 | 0.8665 | 2.503 | 7.04e-06 | 0.0331 |
| CU179663.1 | -0.9124 | 3.7 | 2.09e-05 | 0.07861 |
| BX649405.1 | -1.044 | 2.182 | 4.181e-05 | 0.1311 |
| lrrc4ba | -0.4274 | 4.802 | 9.381e-05 | 0.2293 |
| mybpc2b | 5.453 | 1.16 | 1e-04 | 0.2293 |
| si:ch211-132g1.3 | -0.4373 | 5.538 | 0.0001097 | 0.2293 |
| pcnp | 0.4989 | 5.561 | 0.0001584 | 0.2979 |

*Expression patterns for significantly DE genes in the comparison between psen2^N140fs/+^ samples and psen2^+/+^ samples. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.*

*Expression patterns for the next most highly ranked genes in the comparison between psen2^N140fs/+^ samples and psen2^+/+^ samples, but which are not formally considered as DE. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.*

FAD Vs WT

The next analysis compared psen2^T141_L142delinsMISLISV/+^ samples to psen2^+/+^ samples. No genes could be considered as DE at an FDR of 5%, with the most highly ranked gene (tnk2a) only reaching an FDR of 0.45. In the following plots, a negative value for logFC corresponds to decreased expression in the heterozygous mutants.

*MD plot for psen2^T141_L142delinsMISLISV/+^ samples compared to psen2^+/+^ samples*

*Volcano plot for psen2^T141_L142delinsMISLISV/+^ samples compared to psen2^+/+^ samples*

10 most highly ranked genes in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^+/+^ samples

| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| tnk2a | 0.272 | 5.581 | 2.406e-05 | 0.4526 |
| si:ch211-56a11.2 | 1.049 | 1.22 | 5.109e-05 | 0.4805 |
| si:ch73-236c18.2 | 1.167 | 2.086 | 0.0001036 | 0.5041 |
| BX548026.1 | -0.7806 | 0.9622 | 0.000115 | 0.5041 |
| atxn1l | -0.5015 | 2.467 | 0.000134 | 0.5041 |
| cdc20 | -0.7455 | 1.258 | 0.0004181 | 0.738 |
| arhgap11a | -0.9871 | 0.06342 | 0.000424 | 0.738 |
| celsr1b | -0.2806 | 5.409 | 0.0005467 | 0.738 |
| ncapd2 | -0.3511 | 3.414 | 0.0006129 | 0.738 |
| stn1 | 0.4679 | 3.125 | 0.0006327 | 0.738 |

*Expression patterns for the 5 most highly ranked genes in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^+/+^ samples. None were considered as DE. Values are given as CPM using an offset of 1 to avoid zeroes, with the y-axis being displayed on the log scale.*

FAD Vs FS

The final analysis compared psen2^T141_L142delinsMISLISV/+^ samples to psen2^N140fs/+^ samples. A total of 2 genes were potentially detected as differentially expressed using an FDR of 5%. In the following plots, a negative value for logFC corresponds to decreased expression in psen2^T141_L142delinsMISLISV/+^ samples relative to psen2^N140fs/+^ samples, and a positive value corresponds to increased expression.

*MD plot for psen2^T141_L142delinsMISLISV/+^ samples compared to psen2^N140fs/+^ samples*

*Volcano plot for psen2^T141_L142delinsMISLISV/+^ samples compared to psen2^N140fs/+^ samples*

10 most highly ranked genes in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^N140fs/+^ samples

| Symbol | logFC | AveExpr | P.Value | FDR |
|---|---|---|---|---|
| psen2 | 0.7094 | 4.58 | 1.495e-08 | 0.0002811 |
| CABZ01035279.1 | 8.766 | 0.2326 | 2.231e-07 | 0.002098 |
| CU179663.1 | 0.9466 | 3.7 | 9.56e-06 | 0.05814 |
| si:ch211-114l13.4 | 1.651 | 2.089 | 1.236e-05 | 0.05814 |
| si:ch73-236c18.2 | 1.448 | 2.086 | 1.58e-05 | 0.05942 |
| si:ch211-114l13.3 | 2.008 | -0.1738 | 2.307e-05 | 0.07232 |
| BX649405.1 | 0.9933 | 2.182 | 4.618e-05 | 0.1241 |
| tnk2a | 0.244 | 5.581 | 8.528e-05 | 0.1808 |
| CABZ01084501.2 | 0.6405 | 3.94 | 8.651e-05 | 0.1808 |
| mcoln1a | -0.5633 | 4.208 | 0.0001042 | 0.1829 |

*Expression patterns for significantly DE genes in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^N140fs/+^ samples. This is essentially a subset of the previously identified genes*

*Expression patterns for the next most highly ranked genes in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^N140fs/+^ samples, but which are not formally considered as DE*

Differential Transcript Expression

As the level of transcript complexity is lower in zebrafish than in human, and 1:1 mapping between species is less robust, only a brief analysis was performed at the transcript level. In essence, the same genes were found as the most highly ranked, with changes in expression of psen2 transcripts detected as expected, providing a form of positive control. Following the top tables, the basic transcript expression patterns are shown for three possible genes of interest. Notably, for both si:ch211-132g1.3 and slc37a4b, the transcripts showing the strongest differential expression are expressed at very low levels.

10 most highly ranked transcripts in the comparison between psen2^N140fs/+^ samples and psen2^+/+^ samples

| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| ENSDART00000137332 | si:ch211-132g1.3 | -6.333 | -1.451 | 2.801e-07 | 0.008417 | ENSDARG00000089477 |
| ENSDART00000187524 | CABZ01035279.1 | -8.439 | -0.1248 | 5.656e-07 | 0.008498 | ENSDARG00000116774 |
| ENSDART00000127351 | atxn1l | -0.7309 | 2.88 | 5.628e-06 | 0.05638 | ENSDARG00000086977 |
| ENSDART00000114613 | ptcd1 | 0.8702 | 2.581 | 8.968e-06 | 0.06737 | ENSDARG00000076176 |
| psen2N140fs | psen2 | 3.401 | -4.421 | 1.931e-05 | 0.116 | ENSDARG00000015540 |
| ENSDART00000185608 | si:ch211-160d14.6 | -6.728 | -1.329 | 2.352e-05 | 0.1178 | ENSDARG00000115710 |
| ENSDART00000188158 | BX649405.1 | -1.041 | 2.11 | 6.452e-05 | 0.273 | ENSDARG00000112605 |
| ENSDART00000143184 | mybpc2b | 5.446 | 1.012 | 7.267e-05 | 0.273 | ENSDARG00000021265 |
| ENSDART00000006381 | psen2 | -0.9832 | 1.357 | 8.181e-05 | 0.2731 | ENSDARG00000015540 |
| ENSDART00000182716 | actb1 | -4.621 | -2.578 | 9.602e-05 | 0.2885 | ENSDARG00000113649 |

10 most highly ranked transcripts in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^+/+^ samples

| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| psen2T141_L142delinsMISLISV | psen2 | 6.452 | -2.98 | 1.806e-11 | 5.427e-07 | ENSDARG00000015540 |
| ENSDART00000168837 | fam168b | 1.906 | 2.399 | 3.9e-05 | 0.3795 | ENSDARG00000101733 |
| ENSDART00000006381 | psen2 | -0.9714 | 1.357 | 4.901e-05 | 0.3795 | ENSDARG00000015540 |
| ENSDART00000144157 | si:ch211-56a11.2 | 1.025 | 1.713 | 5.052e-05 | 0.3795 | ENSDARG00000093677 |
| ENSDART00000186112 | cntnap2a | 0.3749 | 3.807 | 9.329e-05 | 0.4123 | ENSDARG00000058969 |
| ENSDART00000180982 | hs6st1b | 6.271 | -2.079 | 0.0001158 | 0.4123 | ENSDARG00000116688 |
| ENSDART00000168762 | si:ch73-236c18.2 | 1.184 | 0.9583 | 0.0001202 | 0.4123 | ENSDARG00000103829 |
| ENSDART00000172408 | arhgap11a | -1.641 | -0.7479 | 0.0001313 | 0.4123 | ENSDARG00000100019 |
| ENSDART00000127351 | atxn1l | -0.5008 | 2.88 | 0.000136 | 0.4123 | ENSDARG00000086977 |
| ENSDART00000091529 | wasf3b | -0.4363 | 5.912 | 0.0001372 | 0.4123 | ENSDARG00000062948 |

10 most highly ranked transcripts in the comparison between psen2^T141_L142delinsMISLISV/+^ samples and psen2^N140fs/+^ samples

| Transcript | Symbol | logFC | AveExpr | P.Value | FDR | gene_id |
|---|---|---|---|---|---|---|
| psen2T141_L142delinsMISLISV | psen2 | 6.428 | -2.98 | 2.399e-11 | 7.208e-07 | ENSDARG00000015540 |
| ENSDART00000137332 | si:ch211-132g1.3 | 5.577 | -1.451 | 7.251e-07 | 0.01089 | ENSDARG00000089477 |
| ENSDART00000187524 | CABZ01035279.1 | 7.413 | -0.1248 | 1.465e-06 | 0.01467 | ENSDARG00000116774 |
| ENSDART00000150193 | slc37a4b | -1.158 | -0.05288 | 8.973e-06 | 0.06741 | ENSDARG00000077180 |
| psen2N140fs | psen2 | -3.401 | -4.421 | 1.187e-05 | 0.07134 | ENSDARG00000015540 |
| ENSDART00000168762 | si:ch73-236c18.2 | 1.452 | 0.9583 | 1.894e-05 | 0.09485 | ENSDARG00000103829 |
| ENSDART00000147678 | si:dkey-222h21.2 | 2.015 | 0.6667 | 3.399e-05 | 0.1445 | ENSDARG00000094297 |
| ENSDART00000141678 | si:ch211-114l13.3 | 1.96 | -1.02 | 3.847e-05 | 0.1445 | ENSDARG00000094346 |
| ENSDART00000188136 | CABZ01084501.2 | 0.6338 | 4.433 | 7.284e-05 | 0.2116 | ENSDARG00000113332 |
| ENSDART00000185608 | si:ch211-160d14.6 | 5.708 | -1.329 | 7.721e-05 | 0.2116 | ENSDARG00000115710 |